
Fixes “The model produced invalid content” error when calling functions #3429

Draft · wants to merge 10 commits into base: 0.2
Conversation

@davorrunje (Collaborator) commented Aug 27, 2024

Why are these changes needed?

This fixes the issue related to the stricter checking of JSON parameters in function calling, as described here:

https://community.openai.com/t/error-the-model-produced-invalid-content/747511

This PR removes the 'name' parameter from messages in the OpenAI client. It also adds a parameter to the tool JSON specification that was previously missing.
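
For orientation, a minimal sketch of the kind of change described above, assuming messages are plain dicts in the OpenAI chat format (the helper name and placement are illustrative, not the actual diff):

    def _strip_name_field(messages):
        # Return a copy of the messages with the 'name' key removed from each.
        # Illustrative only: the real change lives in autogen/oai/client.py
        # and may differ in naming and placement.
        return [{k: v for k, v in m.items() if k != "name"} for m in messages]

    # applied just before the request is sent, e.g.:
    # params["messages"] = _strip_name_field(params["messages"])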

Related issue number

Closes #3247

Checks

@codecov-commenter commented Aug 27, 2024

Codecov Report

Attention: Patch coverage is 80.00000% with 2 lines in your changes missing coverage. Please review.

Please upload report for BASE (0.2@5ad2677).

Files with missing lines   Patch %   Lines
autogen/oai/client.py      77.77%    2 Missing ⚠️
Additional details and impacted files
@@          Coverage Diff           @@
##             0.2    #3429   +/-   ##
======================================
  Coverage       ?   29.61%           
======================================
  Files          ?      117           
  Lines          ?    13022           
  Branches       ?     2469           
======================================
  Hits           ?     3856           
  Misses         ?     8819           
  Partials       ?      347           
Flag        Coverage Δ
unittests   29.59% <80.00%> (?)

Flags with carried forward coverage won't be shown.


@davorrunje changed the title from “Fixes ‘The model produced invalid content’ error with function calling” to “Fixes ‘The model produced invalid content’ error when calling functions” Aug 27, 2024
@davorrunje davorrunje requested review from yenif and Hk669 August 27, 2024 12:03
@davorrunje davorrunje marked this pull request as draft August 27, 2024 12:19
@marklysze (Collaborator) commented
@davorrunje, thanks for creating this - interesting that OpenAI are also removing name. This is likely to affect group chat with speaker selection. May need to incorporate the recent name transforms as a simpler integration for that (but that's for another discussion/time :) ).

I'll give it a test...

@davorrunje davorrunje marked this pull request as draft August 28, 2024 05:54
@marklysze (Collaborator) commented
@davorrunje, thanks again for working on a fix for the exception.

Would you have an example that I could use to replicate the exception?



    @pytest.mark.skipif(skip, reason="openai>=1 not installed")
    def test_chat_completion_after_tool_call():
        ...
@davorrunje (Collaborator, Author) commented
@marklysze here is an example of a failing completion call.
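
The failing payload itself is in the diff and not reproduced here; for orientation only, a chat completion call after a tool call generally has this shape with openai>=1 (all values below are illustrative):

    from openai import OpenAI

    client = OpenAI()
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "user", "content": "What is 2 + 2?"},
            {
                # assistant turn that requested the tool call
                "role": "assistant",
                "content": None,
                "tool_calls": [
                    {
                        "id": "call_1",
                        "type": "function",
                        "function": {"name": "add", "arguments": '{"a": 2, "b": 2}'},
                    }
                ],
            },
            # tool result fed back for the follow-up completion
            {"role": "tool", "tool_call_id": "call_1", "content": "4"},
        ],
    )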

@marklysze (Collaborator) commented Aug 28, 2024

Thanks @davorrunje, is it possible to get the AutoGen code that would have generated this?

As an update: I ran the params through OpenAI's API (response = completions.create(**params)) and it ran through okay and returned a function call.

With my agent's termination check, it failed when checking the content for the termination keyword because content is None:

    is_termination_msg=lambda x: True if "FINISH" in x["content"] else False

The exception: argument of type 'NoneType' is not iterable

Updating my termination expression corrected that:

    is_termination_msg=lambda x: True if x["content"] and "FINISH" in x["content"] else False

@davorrunje (Collaborator, Author) replied

@marklysze I marked the PR as a draft because I am not sure yet which workaround is best. The issue seems to be on the OpenAI side, and it is more common with GPT-4o-mini than with older models. Maybe we should remove names only if we get an exception? (A rough sketch of that idea follows the list below.) Even changing the system message slightly helps sometimes. The list of possible workarounds (https://community.openai.com/t/bizarre-issue-preventing-response-from-gpt-4o-mini-the-model-produces-invalid-content/875432):

  • Remove tools and tool_choice args.
  • Slightly reduce the length of the prompt, even by a couple of words.
  • Add at least a second user message or an assistant message.
  • Change the name or remove the name parameter (but that didn’t always work, depending on the prompt). Some names work, others don’t. ‘Gregory’ worked, for example.
  • Change the message. Some work and some don’t. ‘Hello.’ didn’t work, but ‘Hello, friend.’ worked.
  • Change to gpt-4o or gpt-3.5-turbo.
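
A rough sketch of the exception-triggered idea (the wrapper name and the error predicate are assumptions, not the final design):

    def create_without_names_on_error(client, params, is_invalid_content_error):
        # Try the call as-is; only if the specific error hits, retry once
        # with the 'name' key stripped from every message.
        try:
            return client.chat.completions.create(**params)
        except Exception as e:
            if not is_invalid_content_error(e):
                raise
            retry = dict(params)
            retry["messages"] = [
                {k: v for k, v in m.items() if k != "name"}
                for m in params["messages"]
            ]
            return client.chat.completions.create(**retry)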

@marklysze (Collaborator) replied

Thanks @davorrunje, appreciate the detail and the note on the draft.

It is definitely tricky because removing the name is a considerable change and could affect people's existing code. At the very least, I'd suggest it is made an option that defaults to the existing behaviour of leaving it in.

From the list you provided, I think the third point would be the safest way to minimise the impact on the validity of the messages. In a LinkedIn post on this being intermittent behaviour, they added a message "DO NOT PRODUCE INVALID CONTENT" and that seemed to fix it! :)

I would love to be able to replicate it, have you got any other examples that you can get to throw an exception?

I wonder if, rather than changing the messages to start with, we run inference and, if it throws that specific exception, adjust the messages (perhaps with the 3rd option above) and try again up to x times.
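
Something along these lines, as a rough sketch (the error check and the appended message are placeholders, not a settled design):

    def create_with_retries(client, params, max_retries=3):
        # Run inference; on the specific "invalid content" error, append an
        # extra user message (workaround 3 above) and try again, up to
        # max_retries attempts in total.
        attempt_params = dict(params)
        for attempt in range(max_retries):
            try:
                return client.chat.completions.create(**attempt_params)
            except Exception as e:
                if "invalid content" not in str(e) or attempt == max_retries - 1:
                    raise
                attempt_params = dict(attempt_params)
                attempt_params["messages"] = list(attempt_params["messages"]) + [
                    {"role": "user", "content": "Please continue."}
                ]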

@davorrunje (Collaborator, Author) replied

Yes, I think we need to add a simple hack first and see how it goes. I see this error quite often when working with function calls and GPT-4o and GPT-4o-mini. I'll try adding a message as suggested first.

@marklysze (Collaborator) replied

Okay, not sure if it's possible, but if you do get the exception and are able to capture params["messages"], it would be interesting to see whether it can be replicated.

See how you go with the additional message.

@lmcmahi commented Aug 29, 2024

Hi, I have the same error ("The model produced invalid content" when calling functions). How can I apply this fix? Any ideas?

@marklysze (Collaborator) replied

Hi @lmcmahi, would you be able to provide a code sample that produces this error? It would help in testing out viable fixes.

@zhwuwuwu commented

Hi, I've encountered exactly the same error. Have you found a good solution?

@ekzhu ekzhu changed the base branch from main to 0.2 October 2, 2024 18:25
@jackgerrits jackgerrits added the “0.2” label (Issues which are related to the pre 0.4 codebase) Oct 4, 2024
@rysweet rysweet added the “awaiting-op-response” label (Issue or pr has been triaged or responded to and is now awaiting a reply from the original poster) Oct 10, 2024
@fniedtner fniedtner removed the llm label Oct 22, 2024
@ekzhu (Collaborator) commented Oct 24, 2024

@davorrunje would you say the proposed fix is still necessary after gpt-4o-2024-08-06?

Labels: 0.2, awaiting-op-response
Projects: none yet
10 participants